Article: 13299 of comp.protocols.kermit.misc
Path: newsmaster.cc.columbia.edu!panix!jfk3-feed1.news.algx.net!dca6-feed2.news.algx.net!allegiance!newsfeed1.cidera.com!Cidera!newsfeed.media.kyoto-u.ac.jp!newsfeed.rim.or.jp!news.rim.or.jp!not-for-mail
From: Ishikawa <ishikawa@yk.rim.or.jp>
Newsgroups: comp.protocols.kermit.misc
Subject: Re: a bug on GNU/linux: speed reset to unintended value occasionally.
Date: Tue, 09 Apr 2002 02:35:48 +0900
Organization: Ye 'Ol Disorganized NNTPCache groupie
Lines: 347
Message-ID: <3CB1D4F3.10D79B1F@yk.rim.or.jp>
References: <3CAFF81C.8039CBF8@yk.rim.or.jp> <a8q34m$l3l$1@watsol.cc.columbia.edu>
NNTP-Posting-Host: pl251.nas911.n-yokohama.nttpc.ne.jp
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
X-Trace: news.rim.or.jp 1018287351 58894 210.139.98.251 (8 Apr 2002 17:35:51 GMT)
X-Complaints-To: root@rim.or.jp
NNTP-Posting-Date: Mon, 8 Apr 2002 17:35:51 +0000 (UTC)
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.4.17 i686)
X-Accept-Language: ja, en
Cache-Post-Path: duron!unknown@localhost
X-Cache: nntpcache 2.3.3 (see http://www.nntpcache.org/)
Xref: newsmaster.cc.columbia.edu comp.protocols.kermit.misc:13299
>
> Thank you for the excellent report. I was ready to start
> a big debugging session, only to find that my Linux PC won't
> turn on any more. There's not much I can do at the moment
> without a Linux PC where I have hands-on access to the serial
> port and can hook up test equipment.
Murphy's law strikes!
Anyway, here is an additional report I prepared...
---
Thank you for the quick response.
I posted the report on the weekend, so I thought I would have to
wait until late Monday or so.
It was a pleasant surprise to see a quick reply on the weekend.
Anyway, I should not have mentioned CISCO in the bug report.
The CISCO router DOES use the traditional bytesize and parity settings,
and has nothing to do with the reported bug, which seems to be
triggered by "set parity hardware".
You are right. If CISCO used such a strange communication setting,
a lot more people would have requested the 8E1 setting in KERMIT
long ago.
Anyway, to give a better background
information, here is a brief description of the
hardware configuration where the problem was observed.
The proprietary hardware box that is connected to a
linux PC at the office uses the 8E1 datasize setting.
At the office:
PC <----- serial cable --------------> HW box.
RedHat 7.2 8E1 (19200/38400 bps)
( It seems the kernel is 2.4.7).
At home
PC <--- serial port ---> nothing is connected at the moment.
Debian 2.2rx (x being 5 or 6; I upgraded from r2 on my own.)
Kernel is 2.4.17.
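For readers unfamiliar with the notation, 8E1 means 8 data bits, even parity, and 1 stop bit. In POSIX termios terms that would look roughly like the sketch below. It is only an illustration: set_8e1() is a hypothetical helper, not a Kermit function, and how ckutio.c actually builds these flags for "set parity hardware" is not shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/* Hypothetical helper (not from the Kermit sources): set up a termios
   structure for 8 data bits, even parity, 1 stop bit.
   Only the structure is modified; nothing is written to a device. */
static void set_8e1(struct termios *t)
{
    assert(t != NULL);
    t->c_cflag &= ~(tcflag_t)CSIZE;   /* clear the character-size field */
    t->c_cflag |= CS8;                /* 8 data bits                    */
    t->c_cflag |= PARENB;             /* enable the parity bit          */
    t->c_cflag &= ~(tcflag_t)PARODD;  /* PARODD clear means even parity */
    t->c_cflag &= ~(tcflag_t)CSTOPB;  /* CSTOPB clear means 1 stop bit  */
}
```

To take effect on a real port, the structure would first be filled by tcgetattr(), modified as above, and then written back with tcsetattr().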
Too bad that your Linux PC no longer boots.
Murphy's law strikes at the most inconvenient time.
While you try to resurrect your linux PC,
I have dug into the problem on my own.
The following discovery
might help you in locating the cause of the bug.
Here are a few findings. (Debugging was done on the
RedHat linux PC at the office.)
- The kermit binary on my RedHat PC seems to be
the one downloaded from Columbia. (Not someone's
RPM.)
- I have found a certain command sequence that shows
the bug repeatedly. (Shown below.)
[ A nice thing is that I also found that if I don't
enable "set parity hardware", the flipping of speed
does not occur with the same sequence.
This is a key clue to where the bug might be. ]
- I ran "strace" to trace the system calls during the
above mentioned command sequence and found that
a call to ioctl() indeed switches speed to 2400 (B2400)!!!
(Shown below.)
- Since I have forgotten how to invoke the debug features of
kermit, I inserted fprintf(stderr, ...) statements
in strategic places and found the
statement that invokes ioctl() with the incorrect speed setting.
I think we are very close to finding the cause of the bug.
Now the details.
1. The simplest sequence to find the bug on RedHat 7.2 PC
(I may have a .kermitrc file for the root account.
I am not sure, but the speed setting of 38400 might
suggest so.)
The following command sequence
repeatedly showed the flipping of speed on RedHat 7.2.
# strace -o /tmp/t.out ./wermit
C-Kermit 8.0.201, 8 Feb 2002, for Linux
Copyright (C) 1985, 2002,
Trustees of Columbia University in the City of New York.
Type ? or HELP for help.
(/home/ishikawa/KERMIT/) C-Kermit>set modem none
(/home/ishikawa/KERMIT/) C-Kermit>set line /dev/ttyS0
(/home/ishikawa/KERMIT/) C-Kermit>show comm
Communications Parameters:
Line: /dev/ttyS0, speed: 38400, mode: local, modem: none
Parity: none, stop-bits: (default) (8N1)
Duplex: full, flow: none, handshake: none
Carrier-watch: auto, close-on-disconnect: off
Lockfile: /var/lock/LCK..ttyS0
Terminal bytesize: 8, escape character: 28 (^\)
Carrier Detect (CD): Off
Dataset Ready (DSR): Off
Clear To Send (CTS): On
Ring Indicator (RI): Off
Data Terminal Ready (DTR): On
Request To Send (RTS): On
Type SHOW DIAL to see DIAL-related items.
Type SHOW MODEM to see modem-related items.
(/home/ishikawa/KERMIT/) C-Kermit>set parity hardware
(/home/ishikawa/KERMIT/) C-Kermit>show comm
show comm
Communications Parameters:
Line: /dev/ttyS0, speed: 38400, mode: local, modem: none
Parity: hardware even, stop-bits: (default) (8E1)
Duplex: full, flow: none, handshake: none
Carrier-watch: auto, close-on-disconnect: off
Lockfile: /var/lock/LCK..ttyS0
Terminal bytesize: 8, escape character: 28 (^\)
Carrier Detect (CD): Off
Dataset Ready (DSR): Off
Clear To Send (CTS): On
Ring Indicator (RI): Off
Data Terminal Ready (DTR): On
Request To Send (RTS): On
Type SHOW DIAL to see DIAL-related items.
Type SHOW MODEM to see modem-related items.
(/home/ishikawa/KERMIT/) C-Kermit>connect
Connecting to /dev/ttyS0, speed 38400
Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
CONNECT speed=38400 <--- a debug output of my own.
?Carrier required but not detected.
***********************************
Hint: To CONNECT to a serial device that
is not presenting the Carrier Detect signal,
first tell C-Kermit to:
SET CARRIER-WATCH OFF
***********************************
CI: error return here ckucns.c 1367 <--- a debug output of my own.
(/home/ishikawa/KERMIT/) C-Kermit>show comm
***** Now the speed is flipped into 2400 !!! ****
***** See below!
Communications Parameters:
Line: /dev/ttyS0, speed: 2400, mode: local, modem: none
Parity: hardware even, stop-bits: (default) (8E1)
Duplex: full, flow: none, handshake: none
Carrier-watch: auto, close-on-disconnect: off
Lockfile: /var/lock/LCK..ttyS0
Terminal bytesize: 8, escape character: 28 (^\)
Carrier Detect (CD): Off
Dataset Ready (DSR): Off
Clear To Send (CTS): On
Ring Indicator (RI): Off
Data Terminal Ready (DTR): On
Request To Send (RTS): On
Type SHOW DIAL to see DIAL-related items.
Type SHOW MODEM to see modem-related items.
(/home/ishikawa/KERMIT/) C-Kermit>quit
quit
Closing /dev/ttyS0...OK
[root@dell-w2k-note KERMIT]#
2. The strace output.
This is the strace output captured during the command run above.
I show only the relevant portion.
Please note the line marked with "*=>", i.e.,
*=> ioctl(3, 0x5403, {B2400 -opost -isig -icanon -echo ...}) = 0
...
read(0, "connect", 1024) = 7
write(1, "connect", 7) = 7
read(0, "\n", 1024) = 1
write(1, "\n", 1) = 1
time(NULL) = 1018263977
ioctl(0, 0x5401, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(0, 0x5401, {B38400 opost isig -icanon -echo ...}) = 0
ioctl(0, 0x5403, {B38400 opost isig icanon -echo ...}) = 0
alarm(0) = 0
rt_sigaction(SIGALRM, {SIG_IGN}, {SIG_IGN}, 8) = 0
write(1, "Connecting to /dev/ttyS0, speed "..., 39) = 39
write(1, " Escape character: Ctrl-\\ (ASCII"..., 51) = 51
write(1, "Type the escape character follow"..., 54) = 54
write(1, "or followed by ? to see other op"..., 40) = 40
ioctl(0, 0x5401, {B38400 opost isig icanon -echo ...}) = 0
ioctl(0, 0x5401, {B38400 opost isig icanon -echo ...}) = 0
ioctl(0, 0x5403, {B38400 -opost -isig -icanon -echo ...}) = 0
write(2, "CONNECT speed=38400\n", 20) = 20
ioctl(3, 0x5401, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(3, 0x5403, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(3, 0x5401, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(3, 0x5402, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(3, 0x5401, {B38400 -opost -isig -icanon -echo ...}) = 0
*=> ioctl(3, 0x5403, {B2400 -opost -isig -icanon -echo ...}) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {0x80e666c, [INT], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_IGN}, 8) = 0
ioctl(3, 0x5415, [TIOCM_DTR|TIOCM_RTS|TIOCM_CTS|0x4000]) = 0
ioctl(0, 0x5401, {B38400 -opost -isig -icanon -echo ...}) = 0
ioctl(0, 0x5403, {B38400 opost isig icanon -echo ...}) = 0
write(1, "?Carrier required but not detect"..., 36) = 36
write(1, "********************************"..., 36) = 36
write(1, " Hint: To CONNECT to a serial de"..., 42) = 42
write(1, " is not presenting the Carrier D"..., 46) = 46
write(1, " first tell C-Kermit to:\n\n", 26) = 26
write(1, " SET CARRIER-WATCH OFF\n\n", 26) = 26
write(1, "********************************"..., 37) = 37
write(2, "CI: error return here ckucns.c 1"..., 36) = 36
ioctl(0, 0x5401, {B38400 opost isig icanon -echo ...}) = 0
ioctl(0, 0x5403, {B38400 opost isig -icanon -echo ...}) = 0
getpgrp() = 30970
ioctl(1, 0x540f, [30970]) = 0
...
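The strace of that era prints the ioctl request as a raw number. A small lookup like the sketch below may help in reading the log. The table holds the values used on Linux/x86 (other architectures may differ), and ioctl_request_name() is a hypothetical helper written for this note. The correspondence to the libc calls in the comments is how glibc implements them on Linux.

```c
#include <stddef.h>

/* Request numbers seen in the log, with their Linux/x86 names. */
struct ioctl_name {
    unsigned long req;
    const char *name;
};

static const struct ioctl_name ioctl_names[] = {
    { 0x5401, "TCGETS"    },  /* issued by tcgetattr()                   */
    { 0x5402, "TCSETS"    },  /* issued by tcsetattr(fd, TCSANOW, ...)   */
    { 0x5403, "TCSETSW"   },  /* issued by tcsetattr(fd, TCSADRAIN, ...) */
    { 0x540f, "TIOCGPGRP" },  /* issued by tcgetpgrp()                   */
    { 0x5415, "TIOCMGET"  },  /* reads the modem status lines            */
};

/* Hypothetical helper: return the symbolic name for a request number,
   or "unknown" if it is not in the short table above. */
static const char *ioctl_request_name(unsigned long req)
{
    size_t i;
    for (i = 0; i < sizeof ioctl_names / sizeof ioctl_names[0]; i++) {
        if (ioctl_names[i].req == req)
            return ioctl_names[i].name;
    }
    return "unknown";
}
```

Read this way, the line marked "*=>" is a TCSETSW on file descriptor 3, which is what a tcsetattr() call with TCSADRAIN would produce.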
3. Where the call takes place.
After I captured the above log,
I tried inserting fprintf() calls in various places to
see if I could identify the statement that calls
ioctl() with the incorrect speed setting.
It was a trial-and-error effort, but in the end
I could identify the statement.
Near the end of the function ttvt() in the file ckutio.c,
there is code that looks like the following.
The code below is shown AFTER my insertion of fprintf() calls and
my own comment.
The "tcsetattr(ttyfd, TCSADRAIN, &tttvt)" call
is the problematic one.
...
#endif /* VEOL */
#ifdef Plan9
if (p9ttyparity('n') < 0)
return -1;
#else
fprintf(stderr,"CI: ttvt called %s, %d\n", __FILE__, __LINE__);
#ifdef BSD44ORPOSIX
errno = 0;
#ifdef BEOSORBEBOX
tttvt.c_cc[VMIN] = 0; /* DR7 can only poll. */
#endif /* BEOSORBEBOX */
fprintf(stderr,"CI: ttvt called %s, %d\n", __FILE__, __LINE__);
/* CI suspicious. */
x = tcsetattr(ttyfd,TCSADRAIN,&tttvt);
fprintf(stderr,"CI: ttvt called %s, %d\n", __FILE__, __LINE__);
debug(F101,"ttvt BSD44ORPOSIX tcsetattr","",x);
if (x < 0) {
debug(F101,"ttvt BSD44ORPOSIX tcsetattr errno","",errno);
return(-1);
}
#else /* ATTSV */
fprintf(stderr,"CI: tthflow called %s, %d\n", __FILE__, __LINE__);
The tcsetattr() call that follows the "CI suspicious." comment
is the culprit.
4. My educated guess.
Either
(a) tttvt is not initialized correctly, or
(b) when the hardware parity is on,
a few places where the variable hwparity
is referenced and tttvt is updated
may corrupt the data in tttvt in an unexpected
way.
Since the bug appears only when "set parity hardware"
is in effect (well, at least on RedHat), I guess cause
(b) is more likely, although I don't rule out
(a). Maybe both?
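One way to narrow down either guess would be to check the speed stored in a termios structure before it is handed to tcsetattr(). The following is a minimal sketch under stated assumptions: baud_of() and speed_matches() are hypothetical helpers, not Kermit functions, and the name of the variable that holds the speed Kermit intends to use is not known from this post.

```c
#include <assert.h>
#include <stdio.h>
#include <termios.h>

/* Hypothetical helper: map a few speed_t codes to readable baud rates. */
static long baud_of(speed_t s)
{
    switch (s) {
    case B2400:  return 2400;
    case B9600:  return 9600;
    case B19200: return 19200;
    case B38400: return 38400;
    default:     return -1;   /* not in this short table */
    }
}

/* Hypothetical helper: return 1 if the output speed stored in *t equals
   'expected', else 0. On a mismatch, both values go to stderr. */
static int speed_matches(const struct termios *t, speed_t expected,
                         const char *label)
{
    speed_t actual;

    assert(t != NULL);
    actual = cfgetospeed(t);
    if (actual != expected) {
        fprintf(stderr, "CI: %s holds speed %ld, expected %ld\n",
                label, baud_of(actual), baud_of(expected));
        return 0;
    }
    return 1;
}
```

Calling speed_matches(&tttvt, expected, "tttvt") just before the tcsetattr() in ttvt(), and again at each place where tttvt is modified, would show whether the structure starts out with the wrong speed (guess a) or is changed later (guess b).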
Hope this helps.
PS: BTW, one of these days we can probably use one instance of Kermit
to simulate a device that uses the 8E1 datasize setting. The
implementation seems to work more or less correctly, albeit with some bugs
like the one I have found.
The current implementation is good enough to test the 8E1 data setting on
another platform, I think. Actually, this is how I tested the Solaris
7 problems a couple of years ago. I connected the two ports of a
Solaris 7 for x86 PC, tested file transfer, etc., and found
that Solaris needed to use POSIX tty handling or something.
PPS: Your suggestion of connecting the live device
at the end of the cable may be a valid one.
Unfortunately, the particular hardware box
has a nasty habit of not initializing the
serial terminal until about 40 or 50 seconds
after power up.
And the only way for me to tell if the hardware box
booted successfully is to
see a short greeting message that appears
on the serial line.
ONLY THEN can I begin typing certain commands from
kermit into the hardware.
So I needed to monitor this greeting message,
and in so doing, I was forced to monitor
the serial line BEFORE
the signals from the hardware box came alive...
Hmm...